FATCOP: A Fault Tolerant Condor-PVM Mixed Integer Program Solver
Authors
Abstract
We describe FATCOP, a new parallel mixed integer program solver written in PVM. The implementation uses the Condor resource management system to provide a virtual machine composed of otherwise idle computers. The new solver differs from previous parallel branch-and-bound work by implementing a general purpose parallel mixed integer programming algorithm in an opportunistic multiple processor environment, as opposed to a conventional dedicated environment. It shows how to make effective use of resources as they become available while ensuring the program tolerates resource retreat. The solver performs well on test problems arising from real applications, and is particularly useful for solving long-running hard mixed integer programming problems.
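The idea of tolerating resource retreat in a branch-and-bound search can be illustrated with a minimal sketch. This is not FATCOP's code and uses no PVM or Condor calls: it assumes a single master loop, a 0-1 knapsack problem in place of a general mixed integer program, and a random draw (the hypothetical `fail_prob` parameter) that simulates a worker disappearing before it reports back. The only point it shows is that the master keeps its own copy of every node it hands out, so a lost node can be returned to the work pool and the final answer is unaffected.

```python
import random
from collections import deque


def upper_bound(values, weights, capacity, level, value, weight):
    """Fractional-relaxation bound; items must be sorted by value/weight ratio."""
    bound, room = value, capacity - weight
    for i in range(level, len(values)):
        if weights[i] <= room:
            room -= weights[i]
            bound += values[i]
        else:
            bound += values[i] * room / weights[i]
            break
    return bound


def solve(values, weights, capacity, fail_prob=0.3, seed=0):
    """Return (best value, number of simulated worker losses).

    Assumes positive weights and fail_prob < 1.
    """
    rng = random.Random(seed)
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]

    pool = deque([(0, 0, 0)])  # nodes are (level, value so far, weight so far)
    best, lost = 0, 0
    while pool:
        node = pool.popleft()  # master still holds this copy while a worker has it
        if rng.random() < fail_prob:
            # Simulated retreat: the worker vanished, so requeue the node.
            pool.append(node)
            lost += 1
            continue
        level, value, weight = node
        best = max(best, value)  # every node is a feasible partial selection
        if level == len(v):
            continue
        if upper_bound(v, w, capacity, level, value, weight) <= best:
            continue  # prune: no descendant can beat the incumbent
        if weight + w[level] <= capacity:
            pool.append((level + 1, value + v[level], weight + w[level]))
        pool.append((level + 1, value, weight))
    return best, lost


if __name__ == "__main__":
    print(solve([60, 100, 120], [10, 20, 30], 50))
```

Requeuing a lost node repeats some work but never changes the optimum, since the incumbent only improves and pruning depends only on valid bounds. A real opportunistic solver would also have to detect the loss, which this sketch replaces with a random draw.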
Related resources
FATCOP: A Fault Tolerant Condor-PVM Mixed Integer Programming Solver
We describe FATCOP, a new parallel mixed integer program solver written in PVM. The implementation uses the Condor resource management system to provide a virtual machine composed of otherwise idle computers. The solver differs from previous parallel branch-and-bound codes by implementing a general purpose parallel mixed integer programming algorithm in an opportunistic multiple processor envir...
FATCOP 2.0: Advanced Features in an Opportunistic Mixed Integer Programming Solver
We describe FATCOP 2.0, a new parallel mixed integer program solver that works in an opportunistic computing environment provided by the Condor resource management system. We outline changes to the search strategy of FATCOP 1.0 that are necessary to improve resource utilization, together with new techniques to exploit heterogeneous resources. We detail several advanced features in the code that...
A Large Scale Integer and Combinatorial Optimizer
The topic of this thesis, integer and combinatorial optimization, involves minimizing (or maximizing) a function of many variables, some of which belong to a discrete set, subject to constraints. This area has abundant applications in industry. Integer and combinatorial optimization problems are often difficult to solve due to the large and complex set of alternatives. The objective of this the...
A Migration Framework for Executing Parallel Programs in the Grid
The paper describes a parallel program checkpointing mechanism and its potential application in Grid systems in order to migrate applications among Grid sites. The checkpointing mechanism can automatically (without user interaction) support generic PVM programs created by the PGRADE Grid programming environment. The developed checkpointing mechanism is general enough to be used by any Grid job ...
FTOP: A Library for Fault Tolerance in a Cluster
Checkpointing and rollback recovery is a simple technique for fault tolerance. The state of a process is saved on a disk file from which the process can recover on the occurrence of failure. In this paper we describe the implementation of FTOP (Fault Tolerant PVM), a coordinated checkpointing library integrated with PVM. Existing PVM applications require only minor change for incorporating faul...